# Maine
### We will use Maine to test our various time series models. Using accuracy metrics and plots, we will choose the 3 best models and use them to forecast mean monthly wave power for Alaska and Florida.
Observations: Daily mean wave power is highly volatile and dominated by noise, which makes it harder to model directly. Monthly mean wave power, on the other hand, shows a clear seasonal structure with regular peaks and dips and consistent annual cycles across decades. The ACF of the daily series drops sharply after lag 1, suggesting that past daily values carry little signal for future values. The ACF of the monthly series shows a wave-like pattern that indicates strong seasonality, and the autocorrelation persists over time, which makes the monthly series well suited to ARIMA/SARIMA and other seasonal models.
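As a minimal sketch of the daily-versus-monthly comparison described above, the following base R code aggregates a daily series to monthly means and computes both ACFs. The data are simulated stand-ins (an annual cycle plus heavy noise), not the Maine series, and all object names are hypothetical.

```r
# Simulated stand-in for daily wave power: annual cycle plus heavy noise.
set.seed(42)
dates <- seq(as.Date("2000-01-01"), as.Date("2009-12-31"), by = "day")
day_of_year <- as.numeric(format(dates, "%j"))
daily <- 8000 + 3000 * cos(2 * pi * day_of_year / 365.25) +
  rnorm(length(dates), sd = 6000)

# Aggregate to monthly means and wrap them in a ts object with frequency 12.
month_id <- format(dates, "%Y-%m")
monthly_mean <- tapply(daily, month_id, mean)
monthly_ts <- ts(as.numeric(monthly_mean), start = c(2000, 1), frequency = 12)

# Autocorrelations without plotting; lag.max counts observations.
acf_daily <- acf(daily, lag.max = 30, plot = FALSE)
acf_monthly <- acf(monthly_ts, lag.max = 24, plot = FALSE)

# The acf array stores lag 0 first, so element 13 is lag 12 (one year).
daily_lag1 <- acf_daily$acf[2]
monthly_lag12 <- acf_monthly$acf[13]
cat("Daily ACF at lag 1:", round(daily_lag1, 3), "\n")
cat("Monthly ACF at lag 12:", round(monthly_lag12, 3), "\n")
```

Averaging within each month removes much of the day-to-day noise, which is why the seasonal autocorrelation stands out more clearly in the monthly series.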
## Series: maine_ts_train
## ARIMA(2,0,0)(2,0,0)[12] with non-zero mean
##
## Coefficients:
## ar1 ar2 sar1 sar2 mean
## -0.0136 0.1167 0.2955 0.3082 8431.9324
## s.e. 0.0661 0.0606 0.0573 0.0589 622.8468
##
## sigma^2 = 17720534: log likelihood = -2929.12
## AIC=5870.23 AICc=5870.52 BIC=5892.45
##
## Ljung-Box test
##
## data: Residuals
## Q* = 18.166, df = 24, p-value = 0.7949
##
## Model df: 0. Total lags used: 24
Model initial interpretation:

- Residuals fluctuate around 0, but the variance is not constant: positive deviations are somewhat larger than negative ones, which points to a few minor outliers.
- ACF plot: no significant spikes outside the blue lines, which is good.
- Histogram of residuals: roughly normal distribution centered on 0, which is good.
- Ljung-Box test: the p-value (0.79) is much greater than 0.05, so we fail to reject the null hypothesis of no autocorrelation; the residuals are consistent with white noise, which is good.
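The diagnostics above can be reproduced in outline with base R. This is a sketch under stated assumptions: the series is simulated from a stationary seasonal AR process with illustrative coefficients (it is not `maine_ts_train`), and `fitdf = 4` is this sketch's choice for the four AR terms, whereas the output above reports 24 degrees of freedom.

```r
# Simulated stand-in series: (1 - 0.1 B^2)(1 - 0.3 B^12 - 0.3 B^24) y_t = e_t,
# written as one long AR polynomial for arima.sim. Coefficients are illustrative.
set.seed(1)
ar_coefs <- numeric(26)
ar_coefs[2] <- 0.1
ar_coefs[12] <- 0.3
ar_coefs[24] <- 0.3
ar_coefs[14] <- -0.03   # cross term: -(0.1 * 0.3)
ar_coefs[26] <- -0.03   # cross term: -(0.1 * 0.3)
y <- ts(100 + arima.sim(list(ar = ar_coefs), n = 336, sd = 10), frequency = 12)

# Same structure as the reported model: ARIMA(2,0,0)(2,0,0)[12] with non-zero mean.
fit <- arima(y, order = c(2, 0, 0),
             seasonal = list(order = c(2, 0, 0), period = 12),
             include.mean = TRUE, method = "ML")

# Ljung-Box test on the residuals over 24 lags.
# fitdf subtracts the number of estimated AR/MA coefficients from the df.
lb <- Box.test(residuals(fit), lag = 24, type = "Ljung-Box", fitdf = 4)
print(fit)
print(lb)
```

A large p-value here means the test finds no evidence of remaining autocorrelation, which is the same reading applied to the Maine residuals.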
ACF: the present value has very low correlation with the immediately preceding periods. Since we are looking at wave power, longer-term patterns may matter more than the most recent values, which would explain why the exponential smoothing (ES) model did not do well.
##
## Ljung-Box test
##
## data: Residuals from StructTS
## Q* = 678.76, df = 24, p-value < 2.2e-16
##
## Model df: 0. Total lags used: 24
The StructTS model, used without changes from Module 10, performed poorly: its residuals show strong autocorrelation and the fit had a convergence issue. Let's revise it:
## Warning in StructTS(maine_ts_train_diff, type = "BSM", , fixed = c(NA, NA, :
## possible convergence problem: 'optim' gave code = 52 and message 'ERROR:
## ABNORMAL_TERMINATION_IN_LNSRCH'
##
## Ljung-Box test
##
## data: Residuals from StructTS
## Q* = 101.35, df = 24, p-value = 1.764e-11
##
## Model df: 0. Total lags used: 24
Revised model: the residual plot improves (no visible trends or outliers), which suggests:

- The model handles mean and variance reasonably well.
- The seasonal/trend components are likely adequate.

The remaining ACF spikes and the low Ljung-Box p-value imply:

- Short-term dependencies remain unmodeled (e.g., AR/MA effects).
- Seasonal harmonics (higher-frequency cycles) may be missed.

Next steps could be:

- Combine the StructTS model with an ARIMA layer.
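One way the suggested next step might look is sketched below: fit a basic structural model, then fit a small ARMA model to its residuals and re-check the autocorrelation. This is only an illustration under assumptions. It uses the built-in `AirPassengers` series (as in the R documentation example for that dataset) rather than the Maine data, and the AR(1) residual layer is an arbitrary choice, not the author's model.

```r
# Stand-in monthly series from base R (not the Maine data).
y <- log10(AirPassengers) - 2

# Step 1: basic structural model (level + slope + seasonal).
fit_bsm <- StructTS(y, type = "BSM")
res_bsm <- residuals(fit_bsm)

# Ljung-Box test on the structural model's residuals.
lb_before <- Box.test(res_bsm, lag = 24, type = "Ljung-Box")

# Step 2: hypothetical ARIMA layer, here an AR(1) on the residuals.
fit_ar <- arima(res_bsm, order = c(1, 0, 0), include.mean = FALSE)

# Re-test after the AR layer; fitdf = 1 accounts for the AR coefficient.
lb_after <- Box.test(residuals(fit_ar), lag = 24, type = "Ljung-Box", fitdf = 1)

cat("Ljung-Box p-value, BSM residuals:     ", signif(lb_before$p.value, 3), "\n")
cat("Ljung-Box p-value, after AR(1) layer: ", signif(lb_after$p.value, 3), "\n")
```

If the second p-value is still small, the leftover structure is probably not a simple short-lag effect, and a richer residual model or additional seasonal terms would be worth trying.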
Observations: Both TBATS and NN under-forecast monthly wave power. TBATS performs slightly better, but neither model seems to capture the seasonality of the monthly series. For NN, changing the number of lags may help it capture the seasonality better.
## The best model by RMSE is: ARIMA+Fourier
| Model | ME | RMSE | MAE | MPE | MAPE | ACF1 | Theil’s U |
|---|---|---|---|---|---|---|---|
| SNAIVE | -822.5207 | 4290.692 | 3202.526 | -19.62825 | 42.60721 | 0.06549 | 0.89498 |
| SARIMA | -217.5947 | 4026.117 | 3221.739 | -29.70686 | 50.01054 | 0.32049 | 0.83462 |
| STL+ETS | -192.9789 | 3730.735 | 2709.232 | -17.21843 | 36.39588 | 0.18831 | 0.70042 |
| ARIMA+Fourier | 736.3832 | 3690.025 | 2601.569 | -5.30294 | 31.99013 | 0.20083 | 0.72561 |
| ES | 4686.0140 | 5864.526 | 4762.718 | 78.82107 | 80.54519 | -0.00122 | 1.84991 |
| StructTS | 4686.0140 | 5864.526 | 4762.718 | 78.82107 | 80.54519 | -0.00122 | 1.84991 |
| TBATS | 569.0860 | 3728.780 | 2690.142 | -8.37019 | 33.99924 | 0.17281 | 0.70592 |
| NN | -358.1549 | 4298.393 | 3113.552 | -19.31054 | 40.91460 | 0.01821 | 0.73875 |
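For reference, the first five columns of the table follow the usual forecast-accuracy definitions, with the error taken as actual minus forecast. The sketch below computes them in base R and picks the model with the lowest RMSE. The numbers and model names are made up for illustration; ACF1 and Theil's U are omitted.

```r
# Error convention: actual minus forecast, so a positive ME means the
# forecasts are too low on average.
accuracy_metrics <- function(actual, forecast) {
  e <- actual - forecast
  pe <- 100 * e / actual   # percentage errors; unstable when actual is near 0
  c(ME = mean(e),
    RMSE = sqrt(mean(e^2)),
    MAE = mean(abs(e)),
    MPE = mean(pe),
    MAPE = mean(abs(pe)))
}

# Toy test-set values and two hypothetical sets of forecasts.
actual <- c(100, 200, 400)
forecasts <- list(
  model_a = c(110, 190, 380),
  model_b = c(150, 150, 300)
)

# One row per model, one column per metric.
scores <- t(sapply(forecasts, accuracy_metrics, actual = actual))
print(round(scores, 3))

# Select the model with the smallest RMSE.
best <- rownames(scores)[which.min(scores[, "RMSE"])]
cat("The best model by RMSE is:", best, "\n")
```

Because MPE and MAPE divide by the actual value, they grow very large when the series has values close to zero, so RMSE and MAE are often the safer basis for comparison in that case.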
# Alaska
### Start by creating monthly time series objects for Alaska and plotting the ACF and PACF
Observations: The significant spike at lag 1 in both ACF and PACF strongly suggests an AR(1) component. Also, the significant spikes at lag 12 in both ACF and PACF indicate a strong seasonal autoregressive component with a period of 12 months (SAR(1) with a seasonal lag of 12).
## Warning in adf.test(alaska_ts_train): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: alaska_ts_train
## Dickey-Fuller = -7.3596, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
Observations: ADF test returns a p-value of 0.01, which is smaller than our chosen significance level of 0.05. We reject the null hypothesis that the Alaska mean monthly wave power series has a unit root and thus, the series is likely stationary and does not need differencing.
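To make the test statistic less of a black box, here is a base R sketch of the regression behind an augmented Dickey-Fuller test with a constant and linear trend. It is an illustration on simulated data, not a replacement for `adf.test`; the function name `adf_statistic` is hypothetical, and the critical value is the approximate large-sample 5% value for this specification.

```r
# ADF regression: diff(x)_t ~ x_{t-1} + constant + trend + k lagged differences.
# The test statistic is the t-value on the lagged level x_{t-1}.
adf_statistic <- function(x, k = trunc((length(x) - 1)^(1/3))) {
  dx <- diff(x)
  n <- length(dx)
  z <- embed(dx, k + 1)            # columns: dx_t, dx_{t-1}, ..., dx_{t-k}
  dy <- z[, 1]
  y_lag <- x[(k + 1):n]            # level one step before each dy
  trend <- (k + 1):n
  fit <- if (k > 0) {
    lagged <- z[, -1, drop = FALSE]
    lm(dy ~ y_lag + trend + lagged)
  } else {
    lm(dy ~ y_lag + trend)
  }
  summary(fit)$coefficients["y_lag", "t value"]
}

# Simulated stationary AR(1) series as a stand-in for a monthly training set.
set.seed(7)
x <- as.numeric(arima.sim(list(ar = 0.3), n = 300))

stat <- adf_statistic(x)           # default lag order is 6 for n = 300
crit_5pct <- -3.41                 # approximate asymptotic 5% critical value
cat("ADF statistic:", round(stat, 3), "\n")
cat("Reject unit root at 5%:", stat < crit_5pct, "\n")
```

The default lag order `trunc((n - 1)^(1/3))` is the same rule `adf.test` uses, which is where the `Lag order = 6` in the output above comes from.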
### Proceed with using our 3 chosen models on Alaska
Model 1: STL decomposition + ETS
Model 2: ARIMA + Fourier terms
Model 3: TBATS
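To illustrate the idea behind Model 2, the sketch below builds Fourier terms by hand and passes them to base R's `arima` as external regressors, then forecasts a 12-month holdout. Everything here is an assumption for illustration: the data are simulated, `fourier_terms` is a hypothetical helper, and the choices K = 2 and an AR(1) error structure are arbitrary rather than the settings used in this report.

```r
# Build K pairs of sine/cosine regressors for a given seasonal period.
fourier_terms <- function(t, period, K) {
  X <- do.call(cbind, lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * t / period), cos(2 * pi * k * t / period))
  }))
  colnames(X) <- paste0(rep(c("S", "C"), K), rep(seq_len(K), each = 2))
  X
}

# Simulated monthly series: two seasonal harmonics plus AR(1) noise.
set.seed(123)
t_all <- 1:132
y <- 50 + 20 * sin(2 * pi * t_all / 12) + 8 * cos(4 * pi * t_all / 12) +
  as.numeric(arima.sim(list(ar = 0.4), n = 132, sd = 5))

# Hold out the last 12 months as a test set.
train_idx <- 1:120
test_idx <- 121:132
X_train <- fourier_terms(t_all[train_idx], period = 12, K = 2)
X_test <- fourier_terms(t_all[test_idx], period = 12, K = 2)

# Seasonality is handled by the Fourier regressors, so the ARIMA part is
# non-seasonal and only models the short-term error structure.
fit <- arima(ts(y[train_idx], frequency = 12), order = c(1, 0, 0), xreg = X_train)
fc <- as.numeric(predict(fit, n.ahead = 12, newxreg = X_test)$pred)

rmse <- sqrt(mean((y[test_idx] - fc)^2))
cat("Holdout RMSE:", round(rmse, 3), "\n")
```

Raising K lets the regressors follow sharper seasonal shapes at the cost of more parameters, so K is usually chosen by comparing an information criterion or holdout error across a few values.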
## The best model by RMSE is: ARIMA+Fourier
| Model | ME | RMSE | MAE | MPE | MAPE | ACF1 | Theil’s U |
|---|---|---|---|---|---|---|---|
| STL+ETS | 0.41065 | 14.12462 | 10.06050 | -118.37052 | 148.4886 | 0.24409 | 0.44005 |
| ARIMA+Fourier | 2.70368 | 13.68095 | 9.00760 | -79.91860 | 115.1316 | 0.19990 | 0.39252 |
| TBATS | 3.71047 | 14.20461 | 8.81815 | -71.65192 | 107.4045 | 0.18163 | 0.41341 |
# Florida
### Start by creating monthly time series objects for Florida and plotting the ACF and PACF
Observations: The significant spike at lag 1 in both ACF and PACF strongly suggests an AR(1) component. Also, the repeating seasonal patterns at lag 12 for the ACF suggest strong yearly seasonality.
## Warning in adf.test(florida_ts_train): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: florida_ts_train
## Dickey-Fuller = -12.062, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
Observations: ADF test returns a p-value of 0.01, which is smaller than our chosen significance level of 0.05. We reject the null hypothesis that the Florida mean monthly wave power series has a unit root and thus, the series is likely stationary and does not need differencing.
### Proceed with using our 3 chosen models on Florida
Model 1: STL decomposition + ETS
Model 2: ARIMA + Fourier terms
Model 3: TBATS
## The best model by RMSE is: STL+ETS
| Model | ME | RMSE | MAE | MPE | MAPE | ACF1 | Theil’s U |
|---|---|---|---|---|---|---|---|
| STL+ETS | 260.0615 | 1884.797 | 1242.601 | -22.94636 | 48.47699 | 0.04601 | 0.78809 |
| ARIMA+Fourier | 377.5698 | 2019.667 | 1254.629 | -13.78776 | 42.55154 | 0.03404 | 0.85821 |
| TBATS | 644.5316 | 2081.479 | 1271.549 | -4.49300 | 39.17815 | 0.02116 | 0.86973 |